Reading the file toto.csv
Read csv file: toto.csv
args: {'encoding': 'utf-8-sig', 'sep': ',', 'decimal': ',', 'engine': 'python', 'filepath_or_buffer': 'toto.csv', 'thousands': '.', 'parse_dates': ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'predict'], 'infer_datetime_format': True}
Initial dtypes:
a float64
b float64
c float64
d float64
e float64
f float64
g float64
h float64
predict int64
dtype: object
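For reference, the logged args above correspond to a pandas.read_csv call along these lines (file name, separators and column list copied from the args; this is a sketch of the call, not the pipeline's actual code):

    import pandas as pd

    # Arguments copied verbatim from the logged args; note that
    # infer_datetime_format is deprecated in recent pandas versions.
    df = pd.read_csv(
        'toto.csv',
        encoding='utf-8-sig',
        sep=',',
        decimal=',',
        thousands='.',
        engine='python',
        parse_dates=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'predict'],
        infer_datetime_format=True,
    )
    print(df.dtypes)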
Work on PolynomialFeatures: degree 1
Optimal number of clusters
(10000, 9)
Polynomial Features: generates a new feature matrix
consisting of all polynomial combinations of the features.
For 2 features [a, b]:
the degree 1 polynomial gives [a, b]
the degree 2 polynomial gives [1, a, b, a^2, ab, b^2]
...
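As an illustration of this expansion, a minimal scikit-learn sketch (the two feature values are made up):

    import numpy as np
    from sklearn.preprocessing import PolynomialFeatures

    X = np.array([[2.0, 3.0]])           # two features [a, b]
    poly = PolynomialFeatures(degree=2)  # include_bias=True also emits the constant term 1
    print(poly.fit_transform(X))         # [[1. 2. 3. 4. 6. 9.]] i.e. [1, a, b, a^2, ab, b^2]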
ELBOW: explains the variance as a function of the number of clusters.
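One common way to build that elbow curve, sketched with KMeans on made-up data (not the pipeline's actual clustering code):

    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs

    # Made-up data; the elbow is read off the inertia (within-cluster sum of
    # squares) plotted against the number of clusters k.
    X, _ = make_blobs(n_samples=500, centers=4, random_state=0)
    inertias = [KMeans(n_clusters=k, random_state=0).fit(X).inertia_ for k in range(1, 10)]
    print(inertias)  # plot k vs. inertia and look for the "elbow"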
OOB: this is the average error for each training observation,
calculated using the trees that do not contain this observation
in their bootstrap sample.
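For the OOB estimate itself, a hedged scikit-learn sketch (OOB requires bootstrap sampling, which the best ExtraTrees configuration below actually disables, so this is purely illustrative):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import ExtraTreesClassifier

    X, y = make_classification(n_samples=1000, random_state=0)  # stand-in data

    # With bootstrap=True each tree is trained on a resample, so every
    # observation is "out of bag" for some trees; those trees score it.
    clf = ExtraTreesClassifier(n_estimators=100, bootstrap=True, oob_score=True,
                               random_state=0)
    clf.fit(X, y)
    print(clf.oob_score_)  # accuracy estimated on out-of-bag observations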
Estimator ExtraTreesClassifier
ExtraTreesClassifier: as in random forests, a random subset of candidate
features is used, but instead of looking for the most discriminative
thresholds, thresholds are drawn at random for each candidate feature and
the best of these randomly-generated thresholds is picked as
the splitting rule.
Fitting 3 folds for each of 10 candidates, totalling 30 fits
[Parallel(n_jobs=1)]: Done 30 out of 30 | elapsed: 9.1s finished
Best params => {'n_estimators': 100, 'min_samples_split': 4, 'min_samples_leaf': 1, 'max_features': 0.6, 'criterion': 'entropy', 'bootstrap': False}
Best Score => 0.865
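The "3 folds for each of 10 candidates" pattern above is what a randomized search with cv=3 and n_iter=10 prints; a sketch under that assumption (the candidate grid below is hypothetical, only the winning combination appears in the log):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import ExtraTreesClassifier
    from sklearn.model_selection import RandomizedSearchCV

    X, y = make_classification(n_samples=1000, random_state=0)  # stand-in data

    # Hypothetical search space; the real one is not shown in the log.
    param_dist = {
        'n_estimators': [50, 100, 200],
        'min_samples_split': [2, 3, 4, 5],
        'min_samples_leaf': [1, 2, 4],
        'max_features': [0.1, 0.4, 0.6, 'sqrt'],
        'criterion': ['gini', 'entropy'],
        'bootstrap': [False, True],
    }
    search = RandomizedSearchCV(ExtraTreesClassifier(random_state=0), param_dist,
                                n_iter=10, cv=3, verbose=1, n_jobs=1, random_state=0)
    search.fit(X, y)
    print(search.best_params_, search.best_score_)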
Estimator XGBClassifier
Gradient boosting is an approach where new models are created that predict
the residuals or errors of prior models and then added together to make
the final prediction. It is called gradient boosting because it uses a
gradient descent algorithm to minimize the loss when adding new models.
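A toy illustration of the residual-fitting idea described above, using plain regression trees and squared-error loss (for which the negative gradient is simply the residual); this is a concept sketch, not XGBoost's implementation:

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.RandomState(0)
    X = rng.uniform(-3, 3, size=(200, 1))
    y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

    prediction = np.zeros_like(y)
    learning_rate = 0.5
    for _ in range(50):
        residual = y - prediction                      # errors of the current ensemble
        tree = DecisionTreeRegressor(max_depth=2).fit(X, residual)
        prediction += learning_rate * tree.predict(X)  # add the new model's correction

    print(np.mean((y - prediction) ** 2))              # training MSE shrinks as models are added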
Fitting 3 folds for each of 10 candidates, totalling 30 fits
[Parallel(n_jobs=1)]: Done 30 out of 30 | elapsed: 38.4s finished
Best params => {'subsample': 0.9, 'n_estimators': 50, 'min_child_weight': 6, 'max_depth': 8, 'learning_rate': 0.5}
Best Score => 0.855
Estimator KNeighborsClassifier
KNeighborsClassifier: Majority vote of its k nearest neighbors.
Fitting 3 folds for each of 10 candidates, totalling 30 fits
Fitting 3 folds for each of 4 candidates, totalling 12 fits
[Parallel(n_jobs=1)]: Done 12 out of 12 | elapsed: 4.1s finished
Best params => {'n_neighbors': 17, 'p': 2, 'weights': 'distance'}
Best Score => 0.853
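Plugging the reported best combination back into scikit-learn (the data here is a stand-in, not toto.csv):

    from sklearn.datasets import make_classification
    from sklearn.neighbors import KNeighborsClassifier

    X, y = make_classification(n_samples=1000, random_state=0)  # stand-in data

    # Reported best combination: 17 neighbours, Euclidean metric (p=2),
    # neighbours weighted by inverse distance.
    clf = KNeighborsClassifier(n_neighbors=17, p=2, weights='distance')
    clf.fit(X, y)
    print(clf.score(X, y))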
Estimator DecisionTreeClassifier
Decision Tree Classifier: poses a series of carefully crafted questions
about the attributes of the test record. Each time it receives an answer,
a follow-up question is asked until a conclusion about the class label
of the record is reached.
Fitting 3 folds for each of 10 candidates, totalling 30 fits
[Parallel(n_jobs=1)]: Done 30 out of 30 | elapsed: 2.0s finished
Best params => {'min_samples_split': 5, 'min_samples_leaf': 2, 'max_depth': 10, 'criterion': 'entropy'}
Best Score => 0.750
Check the decision tree: 2017-08-1813:13:19.847449.png
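The log points to a rendered PNG of the fitted tree. One common way to produce such an image is export_graphviz plus Graphviz; a sketch with hypothetical file names, stand-in data and the best parameters reported above:

    from sklearn.datasets import make_classification
    from sklearn.tree import DecisionTreeClassifier, export_graphviz

    X, y = make_classification(n_samples=1000, random_state=0)  # stand-in data

    clf = DecisionTreeClassifier(min_samples_split=5, min_samples_leaf=2,
                                 max_depth=10, criterion='entropy')
    clf.fit(X, y)

    # Write a DOT description that Graphviz can render to PNG, e.g.:
    #   dot -Tpng tree.dot -o tree.png
    export_graphviz(clf, out_file='tree.dot', filled=True)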
Work on PolynomialFeatures: degree 2
Optimal number of clusters
dot: graph is too large for cairo-renderer bitmaps. Scaling by 0.880171 to fit
Polynomial Features: generates a new feature matrix
consisting of all polynomial combinations of the features.
For 2 features [a, b]:
the degree 1 polynomial gives [a, b]
the degree 2 polynomial gives [1, a, b, a^2, ab, b^2]
...
ELBOW: explains the variance as a function of the number of clusters.
OOB: this is the average error for each training observation,
calculated using the trees that do not contain this observation
in their bootstrap sample.
Estimator ExtraTreesClassifier
ExtraTreesClassifier: as in random forests, a random subset of candidate
features is used, but instead of looking for the most discriminative
thresholds, thresholds are drawn at random for each candidate feature and
the best of these randomly-generated thresholds is picked as
the splitting rule.
Fitting 3 folds for each of 10 candidates, totalling 30 fits
[Parallel(n_jobs=1)]: Done 30 out of 30 | elapsed: 17.0s finished
Best params => {'n_estimators': 50, 'min_samples_split': 3, 'min_samples_leaf': 1, 'max_features': 0.1, 'criterion': 'gini', 'bootstrap': False}
Best Score => 0.857
Estimator XGBClassifier
Gradient boosting is an approach where new models are created that predict
the residuals or errors of prior models and then added together to make
the final prediction. It is called gradient boosting because it uses a
gradient descent algorithm to minimize the loss when adding new models.
Fitting 3 folds for each of 10 candidates, totalling 30 fits
[Parallel(n_jobs=1)]: Done 30 out of 30 | elapsed: 1.3min finished
Best params => {'subsample': 0.9, 'n_estimators': 100, 'min_child_weight': 7, 'max_depth': 4, 'learning_rate': 0.5}
Best Score => 0.857
Estimator KNeighborsClassifier
KNeighborsClassifier: Majority vote of its k nearest neighbors.
Fitting 3 folds for each of 10 candidates, totalling 30 fits
Fitting 3 folds for each of 4 candidates, totalling 12 fits
[Parallel(n_jobs=1)]: Done 12 out of 12 | elapsed: 36.3s finished
Best params => {'n_neighbors': 11, 'p': 2, 'weights': 'distance'}
Best Score => 0.853
Estimator DecisionTreeClassifier
Decision Tree Classifier: poses a series of carefully crafted questions
about the attributes of the test record. Each time it receives an answer,
a follow-up question is asked until a conclusion about the class label
of the record is reached.
Fitting 3 folds for each of 10 candidates, totalling 30 fits
[Parallel(n_jobs=1)]: Done 30 out of 30 | elapsed: 7.5s finished
Best params => {'min_samples_split': 7, 'min_samples_leaf': 8, 'max_depth': 6, 'criterion': 'gini'}
Best Score => 0.738
Check the decision tree: 2017-08-1813:18:56.364832.png
Estimator Score Degree
0 (ExtraTreeClassifier(class_weight=None, criter... 0.864667 1
1 XGBClassifier(base_score=0.5, colsample_byleve... 0.856800 2
2 (ExtraTreeClassifier(class_weight=None, criter... 0.856667 2
3 XGBClassifier(base_score=0.5, colsample_byleve... 0.855333 1
4 KNeighborsClassifier(algorithm='auto', leaf_si... 0.853333 1
5 KNeighborsClassifier(algorithm='auto', leaf_si... 0.852933 2
6 DecisionTreeClassifier(class_weight=None, crit... 0.750400 1
7 DecisionTreeClassifier(class_weight=None, crit... 0.737867 2
Stacking: a model ensembling technique used to combine information
from multiple predictive models to generate a new model.
task: [classification]
metric: [accuracy_score]
model 0: [ExtraTreesClassifier]
----
MEAN: [0.86173333]
model 1: [XGBClassifier]
----
MEAN: [0.84853333]
model 2: [KNeighborsClassifier]
----
MEAN: [0.86053333]
model 3: [DecisionTreeClassifier]
Stacking 4 models: 100%|██████████| 15/15 [00:21<00:00, 1.63s/it]
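The run above stacks the four tuned base models and reports their mean out-of-fold scores. With the modern scikit-learn API, a comparable stack could be assembled roughly as follows (a sketch, not necessarily the stacking library used here; hyperparameters and data are illustrative):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import ExtraTreesClassifier, StackingClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.tree import DecisionTreeClassifier
    from xgboost import XGBClassifier

    X, y = make_classification(n_samples=1000, random_state=0)  # stand-in data

    # The same four base models; the meta-learner (logistic regression by
    # default) is trained on their out-of-fold predictions.
    stack = StackingClassifier(
        estimators=[
            ('extra_trees', ExtraTreesClassifier(n_estimators=100, random_state=0)),
            ('xgb', XGBClassifier()),
            ('knn', KNeighborsClassifier(n_neighbors=17, weights='distance')),
            ('tree', DecisionTreeClassifier(max_depth=10)),
        ],
        cv=3,
    )
    print(cross_val_score(stack, X, y, cv=3, scoring='accuracy').mean())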